This paper develops a formulation of the quasispecies equations appropriate for polysomic, semiconservatively
replicating genomes. This paper is an extension of previous work on the subject,
which considered the case of haploid genomes. Here, we develop a more general formulation of the 
quasispecies equations that is applicable to diploid and even polyploid genomes. Interestingly, with 
an appropriate classification of population fractions, we obtain a system of equations that is formally 
identical to the haploid case. As with the work for haploid genomes, we consider both random and 
immortal DNA strand chromosome segregation mechanisms. However, in contrast to the haploid 
case, we have found that an analytical solution for the mean fitness is considerably more difficult to 
obtain for the polyploid case. Accordingly, whereas for the haploid case we obtained expressions for 
the mean fitness for the case of an analogue of the single-fitness-peak landscape for arbitrary lesion 
repair probabilities (thereby allowing for non-complementary genomes), here we solve for the mean 
fitness for the restricted case of perfect lesion repair. 
I. INTRODUCTION

The quasispecies theory of evolutionary dynamics was 
originally introduced in a now-classic paper by Manfred 
Eigen in 1971 [1]. In this paper, Eigen developed a system
of ordinary differential equations that were meant to
describe the evolutionary dynamics of replicating polynu- 
cleotide or polypeptide chains. The goal was to develop a 
mathematical framework that would be suitable for mod- 
eling the evolutionary processes relevant to the origin of 
life. Much of the subsequent work by Eigen on quasis- 
pecies theory was done in collaboration with Peter Schus- 
ter, which is why the quasispecies equations are often 
referred to as the Eigen-Schuster equations [2].

In brief, the quasispecies model considers a population of genomes, defined as single-stranded sequences, taken to be of length $L$. A given sequence, denoted $\sigma$, may be expressed as $\sigma = s_1 s_2 \cdots s_L$, where each $s_i$ represents a "letter" or "base" that is chosen from an alphabet of size $S$ (for all known terrestrial life, $S = 4$, though many phenomenological studies work with $S = 2$ for simplicity)

Ei iai m ei . 

With each $\sigma$ is associated a first-order growth rate constant, denoted by $\kappa_\sigma$. The mapping $\kappa: \sigma \to \kappa_\sigma$ defines what is known as the fitness landscape. During replication, it is assumed that a daughter strand is produced from the template parent strand. Replication is not necessarily error-free, which gives rise to a transition probability $p_m(\sigma, \sigma')$, denoting the probability that parent strand $\sigma$ produces the daughter $\sigma'$. The quasispecies
equations may then be expressed as [3J |3J |U |S] , 

$$\frac{dx_\sigma}{dt} = \sum_{\sigma'} \kappa_{\sigma'}\, p_m(\sigma', \sigma)\, x_{\sigma'} - \bar{\kappa}(t)\, x_\sigma \tag{1}$$

Here, $x_\sigma$ denotes the fraction of organisms in the population that have genome $\sigma$, and $\bar{\kappa}(t) = \sum_\sigma \kappa_\sigma x_\sigma$ is the mean fitness of the population.
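As a rough illustration of Eq. (1), the following minimal sketch integrates the quasispecies equations numerically. Several ingredients are assumptions made only for this example and are not taken from the paper: a binary alphabet ($S = 2$), a very short genome (`L = 4`), a single-fitness-peak landscape with the all-zeros sequence as master, a uniform per-base error probability `eps`, and a simple forward-Euler time step.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(u != v for u, v in zip(a, b))


def build_single_peak_model(L=4, k=10.0, eps=0.05):
    """Hypothetical toy model: binary sequences of length L, master = all zeros."""
    seqs = list(product((0, 1), repeat=L))
    kappa = [k if s == seqs[0] else 1.0 for s in seqs]
    # P[a][b] plays the role of p_m(a, b): parent a produces daughter b.
    P = [[eps ** hamming(a, b) * (1 - eps) ** (L - hamming(a, b)) for b in seqs]
         for a in seqs]
    return seqs, kappa, P


def rhs(x, kappa, P):
    """Right-hand side of Eq. (1) for every sequence."""
    n = len(x)
    mean_fitness = sum(kappa[a] * x[a] for a in range(n))
    return [sum(kappa[a] * P[a][b] * x[a] for a in range(n)) - mean_fitness * x[b]
            for b in range(n)]


def integrate(x, kappa, P, dt=0.01, steps=3000):
    """Forward-Euler integration (an assumption; any ODE solver would do)."""
    for _ in range(steps):
        dx = rhs(x, kappa, P)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x


if __name__ == "__main__":
    seqs, kappa, P = build_single_peak_model()
    x0 = [1.0] + [0.0] * (len(seqs) - 1)  # clonal population of the master
    x = integrate(x0, kappa, P)
    print("master fraction at low error rate:", x[0])
```

Raising `eps` toward 0.5 in this toy model spreads the population uniformly over all sequences, which is the finite-length counterpart of the delocalization described in the next paragraph.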

The central result of quasispecies theory is a phe- 
nomenon known as the error catastrophe. The error 
catastrophe refers to a localization to de-localization 
transition over the genome sequence space once mutation 
rates have crossed a critical threshold, naturally termed 
the error threshold. Below the error threshold, natural 
selection is sufficiently strong to localize the population 
distribution to a "cloud" of related strains, termed a qua- 
sispecies. Above the error threshold, natural selection 
is no longer able to counteract mutation-accumulation, 
and the result is evolutionary dynamics governed by es- 
sentially random genetic drift. Over time, the popu- 
lation distribution completely de-localizes over the se- 
quence space, and no identifiable quasispecies emerges. 
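For orientation, the error threshold can be made explicit in the simplest textbook setting. The following is a sketch for the original, conservatively replicating model of Eq. (1), not for the semiconservative equations developed in this paper. It assumes a single-fitness-peak landscape in which the master sequence has fitness $k > 1$ and all other sequences have fitness 1, that $q$ is the probability of error-free replication of the master, and that back-mutations to the master are negligible; the symbols $x_0$ and $q$ are introduced here for illustration only.

```latex
% x_0 : fraction of the population carrying the master sequence (assumed symbol)
% q   : probability that the master is replicated without error (assumed symbol)
\frac{dx_0}{dt} = \bigl(kq - \bar{\kappa}(t)\bigr)\, x_0 ,
\qquad
\bar{\kappa}(t) = (k-1)\, x_0 + 1
\quad\Longrightarrow\quad
x_0^{\ast} =
\begin{cases}
  \dfrac{kq - 1}{k - 1}, & kq > 1, \\[2ex]
  0, & kq \le 1 .
\end{cases}
```

Under these assumptions the master fraction vanishes continuously at $q = 1/k$, which is the error threshold for this toy landscape.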

Although the origin-of-life problem was the original 
motivation for the development of quasispecies theory, 
the quasispecies concept has found broad application in 
the field of virus evolutionary dynamics. The reason for 
this is that many RNA viruses, such as HIV, have suf- 
ficiently high mutation rates that they exhibit a fairly 
broad distribution of genotypes, so that the quasispecies 
concept is highly relevant for these systems. However, be- 
cause the quasispecies equations may be readily adapted 
toward modeling evolution in more complex systems, in 
recent years there have been efforts to develop quasis- 
pecies theory into a useful framework for modeling the 
evolution of cell-based life. Understanding evolution at 
the cellular level will have applications in areas such as 
antibiotic drug resistance in bacteria, immune system 
function, stem cells, and the somatic evolution of can- 
cer. 

Some of the work that has been done in quasis- 
pecies theory to make it suitable for modeling biolog- 
ical systems more complex than molecules and viruses 
includes the following: (1) Developing a formulation of 
the quasispecies model that is appropriate for double- 
stranded, semiconservatively replicating DNA genomes 
[6]. (2) Analysis of quasispecies dynamics for multi-gene genomes, which, among other results, revealed that the error catastrophe is a special case of a more general phenomenon that was termed an "error cascade" [7]. (3) Using quasispecies theory to model evolution in dynamic environments, and to study the co-evolutionary dynamics that arise from the immune response to a viral infection [8, 9]. (4) Modeling mutation propagation in stem and tissue cells [10]. (5) Modeling genetic repair and repair-deficient strains known as mutators [11, 12, ?]. (6) Incorporating horizontal gene transfer and recombination into quasispecies theory [121 HE1 HZ].

Additionally, other recent work on quasispecies theory has included developing quasispecies equations appropriate for describing polysomic genomes [18]. Given that cellular genomes are generally composed of several chromosomes, such a formulation of the quasispecies model is a necessary extension for developing realistic models of the evolutionary dynamics of cellular populations. However, the work on polysomic genomes only considered haploid genomes. In this work, we generalize the quasispecies equations for polysomic genomes to allow for polyploid genomes. We do not use our equations to model a specific biological system in this paper. Nevertheless, we obtain analytical results for the polysomic analogue of the single-fitness-peak landscape, which is the simplest and most commonly studied fitness landscape in quasispecies theory. These analytical results are in agreement with results obtained from stochastic simulations, suggesting that the equations developed here may be suitable for modeling evolutionary processes in real systems.



II. THE MODEL 

A. The Finite Sequence Length Equations 

We consider a population of asexually replicating organisms, each of which is characterized by a genome consisting of $N$ chromosomes. Unlike our previous paper [? ], we do not assume that the chromosomes are necessarily distinguishable, so that we do not impose any kind of chromosome ordering. Thus, a given genome, denoted $\hat{\sigma}$, may be written as $\hat{\sigma} = \{\{\sigma_1, \sigma'_1\}, \ldots, \{\sigma_N, \sigma'_N\}\}$, where $\{\sigma_i, \sigma'_i\}$ denotes the pair of DNA strands of the $i$th chromosome. We also assume that the organisms replicate at a rate characterized by a genome-dependent first-order growth rate constant $\kappa_{\hat{\sigma}}$.

Furthermore, we let $p((\sigma''; \hat{\sigma}''), \{\sigma, \sigma'\})$ denote the probability that strand $\sigma''$, as part of genome $\hat{\sigma}''$, becomes, after daughter strand synthesis and post-replication lesion repair, chromosome $\{\sigma, \sigma'\}$. We also let $p((\sigma''; \hat{\sigma}''), (\sigma, \sigma'))$ denote the probability that strand $\sigma''$, as part of genome $\hat{\sigma}''$, becomes, after daughter strand synthesis and post-replication lesion repair, strand $\sigma$, with daughter strand $\sigma'$. It should be noted that,

$$
p((\sigma''; \hat{\sigma}''), \{\sigma, \sigma'\}) =
\begin{cases}
p((\sigma''; \hat{\sigma}''), (\sigma, \sigma')) + p((\sigma''; \hat{\sigma}''), (\sigma', \sigma)) & \text{if } \sigma \neq \sigma' \\
p((\sigma''; \hat{\sigma}''), (\sigma, \sigma')) & \text{if } \sigma = \sigma'
\end{cases}
\tag{2}
$$



1. Random chromosome segregation 

We first consider the case of random chromosome segregation. Given a population of replicating organisms, we let $x_{\hat{\sigma}}$ denote the fraction of the population characterized by the genome $\hat{\sigma}$. Our goal is to develop an expression for $dx_{\hat{\sigma}}/dt$. To do so, we note that the expression for $dx_{\hat{\sigma}}/dt$ consists of three separate terms: (1) A destruction term, corresponding to the effective destruction of the parent genome as a result of semiconservative replication [? ]. (2) A mean-fitness normalization term, which arises when converting the dynamical equations expressed in terms of population numbers into dynamical equations expressed in terms of population fractions. (3) A mutation contribution term, summing the contribution to $x_{\hat{\sigma}}$ from the various genomes in the population. From Appendix A, we have that,



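$$
\frac{dx_{\hat{\sigma}}}{dt} = -(\bar{\kappa}(t) + \kappa_{\hat{\sigma}})\, x_{\hat{\sigma}} + \frac{1}{2^{N-1}} \sum_{\hat{\sigma}'' = \{\{\sigma''_1, \sigma'''_1\}, \ldots, \{\sigma''_N, \sigma'''_N\}\}} \kappa_{\hat{\sigma}''}\, x_{\hat{\sigma}''} \times \sum_{\pi_N \in \Pi_N(\hat{\sigma})} \prod_{i=1}^{N} \left[ p((\sigma''_i; \hat{\sigma}''), \{\sigma_{\pi_N(i)}, \sigma'_{\pi_N(i)}\}) + p((\sigma'''_i; \hat{\sigma}''), \{\sigma_{\pi_N(i)}, \sigma'_{\pi_N(i)}\}) \right]
\tag{3}
$$

where $\pi_N$ denotes a permutation of the indices $1, \ldots, N$, and $\Pi_N(\hat{\sigma})$ denotes the subset of all such permutations that gives rise to distinct vectors of strand-pairs $(\{\sigma_{\pi_N(1)}, \sigma'_{\pi_N(1)}\}, \ldots, \{\sigma_{\pi_N(N)}, \sigma'_{\pi_N(N)}\})$ obtained from the genome $\hat{\sigma} = \{\{\sigma_1, \sigma'_1\}, \ldots, \{\sigma_N, \sigma'_N\}\}$.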

We may switch from an unordered chromosome representation of the genome to an ordered one, as follows: Given a genome $\{\{\sigma_1, \sigma'_1\}, \ldots, \{\sigma_N, \sigma'_N\}\}$, let $m$ denote the number of distinct strand-pairs. Then we may write that this genome consists of the $m$ distinct strand-pairs $\{\sigma_{i_1}, \sigma'_{i_1}\}, \ldots, \{\sigma_{i_m}, \sigma'_{i_m}\}$, where the strand-pair $\{\sigma_{i_k}, \sigma'_{i_k}\}$ appears $n_k$ times, so that $n_1 + \cdots + n_m = N$.

Note that there are $N!/(n_1! \times \cdots \times n_m!)$ distinct permutations of $(\{\sigma_1, \sigma'_1\}, \ldots, \{\sigma_N, \sigma'_N\})$, so define,

$$
x_{\vec{\sigma} = (\{\sigma_1, \sigma'_1\}, \ldots, \{\sigma_N, \sigma'_N\})} = \frac{n_1! \times \cdots \times n_m!}{N!}\, x_{\hat{\sigma}}
\tag{4}
$$
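The counting argument above can be checked directly. The sketch below is illustrative only: the function names and the representation of a strand-pair as a `frozenset` of two strand labels are assumptions made for this example. It computes the number of distinct orderings $N!/(n_1! \times \cdots \times n_m!)$, confirms it by brute-force enumeration, and splits an unordered population fraction evenly among the orderings.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def count_distinct_orderings(genome):
    """N! / (n_1! * ... * n_m!) for a list of (hashable) strand-pairs."""
    total = factorial(len(genome))
    for n_k in Counter(genome).values():
        total //= factorial(n_k)
    return total


def count_by_enumeration(genome):
    """Brute-force check: enumerate all permutations and keep the distinct ones."""
    return len(set(permutations(genome)))


def ordered_fraction(x_unordered, genome):
    """Share of an unordered-genome fraction assigned to one particular ordering."""
    return x_unordered / count_distinct_orderings(genome)


if __name__ == "__main__":
    # Hypothetical genome with N = 3 chromosomes, two of which are identical.
    pair_a = frozenset(("s1", "s1c"))
    pair_b = frozenset(("s2", "s2c"))
    genome = [pair_a, pair_a, pair_b]
    print(count_distinct_orderings(genome), count_by_enumeration(genome))
```

The brute-force enumeration grows as $N!$ and is only meant as a sanity check for small $N$.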

We obtain, again following the derivation provided in 
Appendix A, 



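$$
\frac{dx_{\vec{\sigma}}}{dt} = -(\bar{\kappa}(t) + \kappa_{\vec{\sigma}})\, x_{\vec{\sigma}} + \frac{1}{2^{N-1}} \sum_{\vec{\sigma}''} \kappa_{\vec{\sigma}''}\, x_{\vec{\sigma}''} \times \prod_{i=1}^{N} \left[ p((\sigma''_i; \vec{\sigma}''), \{\sigma_i, \sigma'_i\}) + p((\sigma'''_i; \vec{\sigma}''), \{\sigma_i, \sigma'_i\}) \right]
\tag{5}
$$

where $p((\sigma''_i; \vec{\sigma}''), \{\sigma_i, \sigma'_i\})$ denotes the probability that parent strand $\sigma''_i$, as part of genome $\vec{\sigma}''$, becomes, after daughter strand synthesis and lesion repair, chromosome $\{\sigma_i, \sigma'_i\}$.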
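Proceeding as with the case of haploid genomes, we may define a vector of ordered strand-pairs population fraction via the definition,

$$
x_{\vec{\vec{\sigma}} = ((\sigma_1, \sigma'_1), \ldots, (\sigma_N, \sigma'_N))} = \frac{1}{2^{k}}\, x_{\vec{\sigma} = (\{\sigma_1, \sigma'_1\}, \ldots, \{\sigma_N, \sigma'_N\})}
\tag{6}
$$

where $k$ denotes the number of chromosomes for which $\sigma_i \neq \sigma'_i$. As with the case for haploid genomes, we obtain from Appendix A that,

$$
\frac{dx_{\vec{\vec{\sigma}}}}{dt} = -(\bar{\kappa}(t) + \kappa_{\vec{\vec{\sigma}}})\, x_{\vec{\vec{\sigma}}} + \frac{1}{2^{N-1}} \sum_{\vec{\vec{\sigma}}''} \kappa_{\vec{\vec{\sigma}}''}\, x_{\vec{\vec{\sigma}}''} \times \prod_{i=1}^{N} \left[ p((\sigma''_i; \vec{\vec{\sigma}}''), (\sigma_i, \sigma'_i)) + p((\sigma'''_i; \vec{\vec{\sigma}}''), (\sigma_i, \sigma'_i)) \right]
\tag{7}
$$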



2. Immortal DNA strand co-segregation 



Immortal DNA strand co-segregation is a chromosome 
segregation mechanism whereby one of the daughter cells 
receives all of the chromosomes containing the oldest 
DNA template strands of the previous replication cycle. 
It is a chromosome segregation mechanism that was hy- 
pothesized to be at work in adult stem cells [13], as a 
way to reduce the accumulation of mutations in stem 
cells. Immortal DNA strand co-segregation has been experimentally
confirmed [20, 21]. Interestingly, there is
evidence to suggest that even unicellular organisms, such 
as Saccharomyces cerevisiae, may exhibit immortal DNA 
strand co-segregation [22]. As a result, we will develop 
the quasispecies equations for immortal strand segrega- 
tion as well. 

To derive the equations for immortal DNA strand co- 
segregation, we first note that a given DNA strand in 
a genome is either newly synthesized, or it has gone 
through a previous replication cycle where it was a tem- 
plate strand. Once a DNA strand is a template strand, 
then it remains a template strand throughout all successive replications. Given a strand $\sigma$, we let $\sigma^{(N)}$ denote a strand that is "new," that is, it has never been a template strand, and we let $\sigma^{(T)}$ denote a strand that has been a template strand at least once. Since a chromosome that was produced in a replication cycle must consist of exactly one template and one new strand, a given chromosome is either of the form $\{\sigma^{(N)}, \sigma'^{(N)}\}$, or $\{\sigma^{(T)}, \sigma'^{(N)}\}$.

We also note that a given genome consists entirely of 
chromosomes containing only new strands, or entirely 
of chromosomes containing one template and one new 
strand. For if one chromosome contains a template 
strand, then that strand must have come from a par- 
ent cell in a previous replication cycle. This parent cell 
must have had N — 1 other parent strands coming from 
N — 1 other chromosomes that segregated into the daugh- 
ter cell. Therefore, the other chromosomes of the genome 
must contain a template strand as well. 
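The bookkeeping behind this argument can be sketched in a few lines. In the toy model below, which is an illustration and not part of the paper's derivation, each strand is reduced to an integer "age" counting how many replication cycles it has served as a template (0 means new), and a chromosome is a pair of ages. The function name `replicate` and this representation are assumptions made for the example; sequence content, mutation, and repair are ignored.

```python
import random


def replicate(genome, immortal, rng):
    """Replicate a genome and segregate the chromosomes into two daughters.

    genome   : list of (age_a, age_b) chromosomes; age 0 denotes a new strand.
    immortal : if True, the chromosomes templated by the older strands all go
               to the first daughter (ties are broken at random).
    rng      : a random.Random instance used for coin flips.
    """
    first, second = [], []
    for age_a, age_b in genome:
        older, younger = max(age_a, age_b), min(age_a, age_b)
        # Each parent strand becomes a template paired with a new strand.
        from_older = (older + 1, 0)
        from_younger = (younger + 1, 0)
        if immortal and older != younger:
            older_goes_first = True
        else:
            older_goes_first = rng.random() < 0.5
        if older_goes_first:
            first.append(from_older)
            second.append(from_younger)
        else:
            first.append(from_younger)
            second.append(from_older)
    return first, second


if __name__ == "__main__":
    rng = random.Random(0)
    founder = [(0, 0)] * 3                 # genome made entirely of new strands
    d1, d2 = replicate(founder, True, rng)
    e1, e2 = replicate(d1, True, rng)
    print(d1, e1, e2)
```

After the first cycle every chromosome contains exactly one template and one new strand, and under immortal co-segregation one daughter lineage keeps accumulating the oldest templates, in line with the argument above.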

Given a genome $\hat{\sigma}$, we let $\hat{\sigma}^{(N/N)}$ signify that the genome consists entirely of new strands, and we let $\hat{\sigma}^{(T/N)}$ signify that the genome consists of chromosomes containing exactly one template and one new strand. We then have, from Appendix A,
-(«(£) + n a )x a( N/N)



dt 

dX & (T/N) . . . . 1 

= -(«(*) + K^j^tT/jv) + -^33- 



E 



JV 



e nw( <7 ^")'(%W'<»(.)))+p((^v").(%(«).<(.)))] 

7r Jv e7r JV (<T( T / N ) »=1 
+ 

AT JV 

E [nKK;^),K J vw>< w )) + IlKK // ;^),K JV w>^( i )))] 

■K N eTT N (cr {T,N '> i=1 1=1 

I 



(8) 



As with the random segregation equations, we can define an ordered chromosome formulation of the dynamics for the immortal DNA strand equations. We obtain, from Appendix A,



dx a{ 



N/N) 



dt 

dXfr(T/N) 



= -(«(*) + n a )x d {N/N) 

= -(«(*) + K a )x ai T/N) + X 



JV 

5-"(N/N) i=l 

JV JV 

g-"(T/JV) j=l j=l 



(9) 



Now, define an ordered strand-pair formulation of the 
dynamics as follows: Define 



and 



and 



y 3 {N/N) = -j:X a {N/N) 



y 3 {T/N) = X a (T/N) 



(10) 



(11) 



(12) 



where k denotes the number of chromosomes with distinct strands in the genome. We then have, from Appendix A,

dys 
dt 



-(/c(t) + ng)ys + 2J HS"Va" 



JV 



JV 



[nKK;0,(^,^)) + IlKK / ;0,(^^i))] 

1=1 8=1 

(13) 

3. Complementarity symmetry 

It is interesting to note that even when we do not assume that the genomes are necessarily haploid, it is still possible to derive an ordered strand-pair formulation of the dynamics that is identical to the haploid case [13]. In the case of haploid genomes, we made an additional assumption regarding the fitness and error landscapes that allows for a convenient representation of the dynamics. We make the identical assumption in this paper, and obtain a similarly convenient representation of the dynamics for the general case.

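Following [? ], we begin by defining two operations $\tau$ and $\gamma$, acting on ordered strand-pairs, as follows: $\tau(\sigma, \sigma') = (\sigma', \sigma)$, and $\gamma(\sigma, \sigma') = (\bar{\sigma}, \bar{\sigma}')$, where $\bar{\sigma}$ denotes the strand complementary to $\sigma$ (because DNA is antiparallel, if $\sigma = b_1 \ldots b_L$, and if $\bar{b}_i$ denotes the base complementary to $b_i$, we then have $\bar{\sigma} = \bar{b}_L \ldots \bar{b}_1$). Furthermore, given some vector of ordered strand-pairs $\vec{\vec{\sigma}} = ((\sigma_1, \sigma'_1), \ldots, (\sigma_N, \sigma'_N))$, and a vector $\vec{s} = (s_1, \ldots, s_N)$, with each $s_i = 0, 1$, we make the following definitions:

$$
\begin{aligned}
\tau^{\vec{s}}\, \vec{\vec{\sigma}} &= (\tau^{s_1}(\sigma_1, \sigma'_1), \ldots, \tau^{s_N}(\sigma_N, \sigma'_N)) \\
\gamma^{\vec{s}}\, \vec{\vec{\sigma}} &= (\gamma^{s_1}(\sigma_1, \sigma'_1), \ldots, \gamma^{s_N}(\sigma_N, \sigma'_N))
\end{aligned}
\tag{14}
$$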
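The two operations can be made concrete with a short sketch. It assumes the readings $\tau(\sigma, \sigma') = (\sigma', \sigma)$ and $\gamma(\sigma, \sigma') = (\bar{\sigma}, \bar{\sigma}')$, with $\bar{\sigma}$ the antiparallel (reversed) complement; strands are represented as Python strings over `ACGT`, and all function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Antiparallel complement: complement every base and reverse the order."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def tau(pair):
    """Swap the two strands of an ordered strand-pair."""
    sigma, sigma_prime = pair
    return (sigma_prime, sigma)


def gamma(pair):
    """Replace each strand of an ordered strand-pair by its complement."""
    sigma, sigma_prime = pair
    return (complement(sigma), complement(sigma_prime))


def apply_vector(op, s, pairs):
    """Apply op to the i-th strand-pair wherever s_i = 1 (cf. Eq. (14))."""
    return tuple(op(pair) if s_i else pair for s_i, pair in zip(s, pairs))


if __name__ == "__main__":
    perfect = ("AACG", complement("AACG"))    # a perfectly complementary pair
    print(tau(perfect), gamma(perfect))       # the two results coincide
```

For a perfectly complementary pair the two operations coincide, which is the property used in the complementarity-symmetry argument that follows.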

Now, note that the fitness landscape is symmetric under $\tau$, that is, $\kappa_{\tau^{\vec{s}} \vec{\vec{\sigma}}} = \kappa_{\vec{\vec{\sigma}}}$ for all $\vec{s} \in \{0, 1\}^N$. We also assume that the fitness landscape satisfies a complementarity symmetry, that is, $\kappa_{\gamma^{\vec{s}} \vec{\vec{\sigma}}} = \kappa_{\vec{\vec{\sigma}}}$ for all $\vec{s} \in \{0, 1\}^N$. The idea behind this assumption is that because taking the complement of a strand essentially amounts to a relabelling of the bases and a change in the order in which those bases are read, without any kind of specific sequence information there is no reason a priori to assume that a complementarity symmetry does not hold. Note that for a strand pair of the form $(\sigma, \bar{\sigma})$, we have that $\gamma(\sigma, \bar{\sigma}) = \tau(\sigma, \bar{\sigma})$. Therefore, for genomes consisting entirely of chromosomes comprised of perfectly complementary strands, we have that $\gamma^{\vec{s}} \vec{\vec{\sigma}} = \tau^{\vec{s}} \vec{\vec{\sigma}}$, and so the complementarity symmetry automatically holds.

We further assume that the transition probability $p((\sigma''_i; \vec{\vec{\sigma}}''), (\sigma_i, \sigma'_i))$ obeys a complementarity symmetry, that is, $p((\gamma^{s_i} \sigma''_i; \gamma^{\vec{s}} \vec{\vec{\sigma}}''), \gamma^{s_i}(\sigma_i, \sigma'_i)) = p((\sigma''_i; \vec{\vec{\sigma}}''), (\sigma_i, \sigma'_i))$. Such a condition can be accomplished if we assume that mutations are due to a base-independent mismatch probability $\epsilon_{\vec{\vec{\sigma}}}$, which obeys the complementarity symmetry.

It may be shown that a population distribution that initially obeys the complementarity symmetry will obey this symmetry for all time, assuming that the fitness landscapes and transition probabilities obey this symmetry. Because this derivation was already done in [18], we will not repeat it here. Furthermore, if the population distribution, along with the fitness landscape and transition probabilities, all obey a complementarity symmetry, then we may express the quasispecies equations in a more convenient form. Again, the derivation has been previously worked out in [18], so we simply present the final results here. For random segregation, we have,

= + K *)yz + Y K S"V3" x 

u" 

N 

se{o,i} N i=i 

(15) 



For immortal DNA strand co-segregation, we have, 
-Jp = -(/c(i) + ng)yg + ^2 K s"Va" x 

B" 

N N 

[nKK;0,K,^)) + nKK';0,(^^D)] 
i=i 1=1 

(16) 

B. The Infinite Sequence Length Equations 

We now proceed to determine how the random segrega- 
tion and immortal strand co-segregation equations look 
in the limit of infinite sequence length. In doing so, we 
will consider fitness landscapes that have certain proper- 
ties that will allow for a considerably simplified version 
of the equations. The assumption of infinite sequence 
length is a common one in quasispecies theory [3, , and is 
simply a mathematical formalization of the assumption 
of very long genome lengths. 

1. The master genome and homologous groups 

To begin, we assume that there exists a "master" genome, $\hat{\sigma}_0 = \{\{\sigma_{0,1}, \bar{\sigma}_{0,1}\}, \ldots, \{\sigma_{0,N}, \bar{\sigma}_{0,N}\}\}$, that has the wild-type fitness $k > 1$. This master genome consists of $M$ distinct strand-pairs, denoted $\{\sigma_{0,i_1}, \bar{\sigma}_{0,i_1}\}, \ldots, \{\sigma_{0,i_M}, \bar{\sigma}_{0,i_M}\}$, where the master genome consists of $n_k$ pairs of the $k$th strand-pair, so that $N = n_1 + \cdots + n_M$. We define the $k$th homologous group of the master genome to be precisely the $n_k$ copies of the $k$th strand-pair, $\{\sigma_{0,i_k}, \bar{\sigma}_{0,i_k}\}$.

We also let $L_k$ denote the length, or the number of base-pairs, in $\{\sigma_{0,i_k}, \bar{\sigma}_{0,i_k}\}$. The total length, $L$, of the master genome, is then defined to be $L = n_1 L_1 + \cdots + n_M L_M$. We then define $\alpha_k = L_k / L$.

We assume that, during replication, daughter strand synthesis is not error-free, and is characterized by a per-base mismatch probability of $\epsilon$. We then allow the total sequence length, $L$, of the master genome to become infinite, while keeping $\mu = \epsilon L$ constant. Physically, this corresponds to maintaining a constant replication fidelity in the limit of very large genomes. This is a common assumption in quasispecies models, and reflects the fact that the average number of mutations per genome per replication cycle, as measured by $\mu$, is generally far smaller than the size of the genomes themselves [3, 4, 5].
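To see what the limit above implies for the statistics of mismatches, the following standard calculation may help. It is a sketch under the stated assumption that each base is mis-paired independently with probability $\epsilon = \mu / L$; the symbol $l$ for the number of mismatches made while synthesizing one set of daughter strands of total length $L$ is introduced here for illustration.

```latex
% Probability of l mismatches in L independent base incorporations,
% with \epsilon = \mu / L held fixed as L -> infinity (Poisson limit).
P(l) = \binom{L}{l}\, \epsilon^{l} (1 - \epsilon)^{L - l}
\;\xrightarrow[\;L \to \infty,\ \epsilon L = \mu\;]{}\;
\frac{\mu^{l}}{l!}\, e^{-\mu},
\qquad
P(0) = \left(1 - \frac{\mu}{L}\right)^{L} \longrightarrow e^{-\mu} .
```

By the same reasoning, a single chromosome strand of length $L_k = \alpha_k L$ would acquire a Poisson-distributed number of mismatches with mean $\alpha_k \mu$.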

In the limit of infinite sequence length, we may make the following assumptions about the master genome: For any two indices $k$, $l$, we have that,

$$
\begin{aligned}
D_H(\sigma_{0,i_k}, \bar{\sigma}_{0,i_l}) &= \infty \\
D_H(\sigma_{0,i_k}, \sigma_{0,i_l}) &= \infty, \quad k \neq l
\end{aligned}
\tag{17}
$$

where $D_H(\sigma_1, \sigma_2)$ denotes the Hamming distance between any two sequences (the Hamming distance is the number of positions where the two sequences differ).

To understand the basis for these assumptions, we may 
note that, in the limit of infinite sequence length, a given 
sequence will, on average, differ from its complement at 
an infinite number of positions [6]. Also, since we do not
assume any kind of correlation between the homologous 
groups, we assume that the strands from distinct homol- 
ogous groups also differ from each other at an infinite 
number of positions. 
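The claim that a long sequence differs from its own complement at a number of positions growing with its length is easy to probe numerically. The sketch below is illustrative only; it assumes a four-letter alphabet, uniformly random bases, and the antiparallel (reversed) complement, and the function names are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Antiparallel complement: complement every base and reverse the order."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def mismatch_fraction_with_complement(length, seed=0):
    """D_H(sigma, complement(sigma)) / length for one random strand."""
    rng = random.Random(seed)
    strand = "".join(rng.choice("ACGT") for _ in range(length))
    return hamming_distance(strand, complement(strand)) / length


if __name__ == "__main__":
    for length in (100, 10_000):
        print(length, mismatch_fraction_with_complement(length))
```

For uniformly random bases the expected fraction of differing positions is 3/4, so the Hamming distance to the complement grows linearly with the sequence length and diverges in the infinite-length limit.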

We now consider an initially clonal population consisting entirely of the master genome that is allowed to reproduce and evolve. After some time, consider some strand-pair $\{\sigma, \sigma'\}$ in some genome of some organism. Suppose that this strand pair has the property that, for some $k$, both Hamming distances $D_H(\sigma, \sigma_{0,i_k})$ and $D_H(\sigma', \bar{\sigma}_{0,i_k})$ are finite. In this case, we say that $\{\sigma, \sigma'\}$ belongs to the $k$th homologous group. Then it follows that $D_H(\sigma, \sigma_{0,i_l})$, $D_H(\sigma, \bar{\sigma}_{0,i_l})$, $D_H(\sigma', \sigma_{0,i_l})$, and $D_H(\sigma', \bar{\sigma}_{0,i_l})$ are infinite for $l \neq k$. It also follows that $D_H(\sigma, \bar{\sigma}_{0,i_k})$ and $D_H(\sigma', \sigma_{0,i_k})$ are infinite, and that $D_H(\sigma, \sigma')$ is infinite. As a result, a given strand-pair can only belong to at most one homologous group.

When $\{\sigma, \sigma'\}$ replicates, both $\sigma$ and $\sigma'$ act as templates for the synthesis of the complementary daughter strand. Because $\mu$ is finite, only a finite number of mismatches will occur in both daughter strand syntheses. As a result, if $\sigma$ produces $\sigma_1$, with daughter $\sigma_2$, then we have that $D_H(\sigma_1, \sigma_{0,i_k})$ and $D_H(\sigma_2, \bar{\sigma}_{0,i_k})$ are both finite. A similar result holds for the daughter strand-pair produced by $\sigma'$. Note then that the daughters of $\{\sigma, \sigma'\}$ also belong to the $k$th homologous group.

Consider a genome where, for $k = 1, \ldots, M$, there are exactly $n_k$ strand-pairs belonging to the $k$th homologous group. These $n_k$ strand-pairs produce, upon replication, $2 n_k$ strand-pairs belonging to the $k$th homologous group, which then segregate equally into two daughter cells, so that each of the daughter cells has exactly $n_k$ strand-pairs belonging to the $k$th homologous group. By induction, it follows that, if we begin with a clonal population consisting entirely of the master genome, then for all times all genomes in the population will have, for each $k = 1, \ldots, M$, exactly $n_k$ strand-pairs belonging to the $k$th homologous group.

2. Viable chromosomes and the fitness landscape 

A given genome is taken to have the wild-type fitness of $k$ if each homologous group contains at least one functional, or viable, chromosome. Otherwise, the fitness is taken to be 1. To completely characterize the fitness landscape, we therefore need to properly define what we mean by a "viable" chromosome. So, consider some strand-pair, $\{\sigma, \sigma'\}$, that belongs to the $k$th homologous group. Then we either have that $D_H(\sigma, \sigma_{0,i_k})$, $D_H(\sigma', \bar{\sigma}_{0,i_k})$ are finite, or $D_H(\sigma', \sigma_{0,i_k})$, $D_H(\sigma, \bar{\sigma}_{0,i_k})$ are finite. Let us assume that the former case holds, since the two cases are completely equivalent.

Then let $l_C$ denote the number of base-pairs where $\sigma$ and $\sigma'$ are complementary, but where $\sigma$ and $\sigma'$ differ from $\sigma_{0,i_k}$ and $\bar{\sigma}_{0,i_k}$, respectively. Let $l_L$ denote the number of base-pairs where $\sigma$ differs from $\sigma_{0,i_k}$, but where $\sigma'$ is identical to $\bar{\sigma}_{0,i_k}$. Let $l_R$ denote the number of base-pairs where $\sigma$ is identical to $\sigma_{0,i_k}$, but where $\sigma'$ differs from $\bar{\sigma}_{0,i_k}$. Finally, let $l_B$ denote the number of base-pairs where $\sigma$ and $\sigma'$ are non-complementary, and both differ from $\sigma_{0,i_k}$ and $\bar{\sigma}_{0,i_k}$, respectively.

Then the strand-pair $\{\sigma, \sigma'\}$ is said to be "viable" if and only if $l_C = l_B = 0$ and $l_L + l_R < l^*_k$, where $l^*_k$ is a function of the homologous group number. The idea here is that if either $l_C$ or $l_B$ are positive, then there are regions of the chromosome where sequence information is lost, rendering the chromosome non-functional. However, where one strand differs from the master sequence but the other strand does not, sequence information is preserved. If there are not too many such mismatches, or lesions, then the cellular enzymatic machinery can recover the master sequence information, rendering the chromosome functional.
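The viability rule can be sketched as a small classifier. This is an illustration under several assumptions that are not in the paper: strands are finite strings over `ACGT`, the second strand is written 5' to 3' so that position `i` of the first strand pairs with position `L - 1 - i` of the second, and the threshold is applied as the strict inequality printed above. The function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def lesion_counts(sigma, sigma_prime, master):
    """Return (l_C, l_L, l_R, l_B) for a strand-pair relative to a master strand.

    master is the master strand that sigma is compared against; its complement
    is the reference for sigma_prime.
    """
    length = len(master)
    if len(sigma) != length or len(sigma_prime) != length:
        raise ValueError("all strands must have the same length")
    l_c = l_l = l_r = l_b = 0
    for i in range(length):
        a = sigma[i]
        b = sigma_prime[length - 1 - i]   # base paired with position i
        a_wrong = a != master[i]
        b_wrong = b != COMPLEMENT[master[i]]
        if a_wrong and b_wrong:
            if b == COMPLEMENT[a]:
                l_c += 1                  # complementary, but both mutated
            else:
                l_b += 1                  # mismatched and both mutated
        elif a_wrong:
            l_l += 1                      # lesion on the first strand only
        elif b_wrong:
            l_r += 1                      # lesion on the second strand only
    return l_c, l_l, l_r, l_b


def is_viable(sigma, sigma_prime, master, l_star):
    """Viability rule: l_C = l_B = 0 and l_L + l_R < l_star (strict, as printed)."""
    l_c, l_l, l_r, l_b = lesion_counts(sigma, sigma_prime, master)
    return l_c == 0 and l_b == 0 and (l_l + l_r) < l_star


if __name__ == "__main__":
    master = "AACG"
    print(lesion_counts("TACG", "CGTT", master))   # one lesion on the first strand
    print(is_viable("TACG", "CGTT", master, l_star=2))
```

If the threshold in the source is meant to be inclusive, the final comparison would use `<=` instead.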

This fitness landscape is of course a great oversimpli- 
fication of actual fitness landscapes. Nevertheless, it is a 
useful first approximation with which we can obtain ana- 
lytical results from our evolutionary dynamics equations. 



3. Population classes, lesion repair, and the infinite 
sequence length equations 

The master genome gives rise to $2^N N!/(n_1! \times \cdots \times n_M!)$ ordered strand-pair vectors, given by $(\tau^{s_1}(\sigma_{0,\pi_N(1)}, \bar{\sigma}_{0,\pi_N(1)}), \ldots, \tau^{s_N}(\sigma_{0,\pi_N(N)}, \bar{\sigma}_{0,\pi_N(N)}))$, where $\vec{s} = (s_1, \ldots, s_N) \in \{0, 1\}^N$, and $\pi_N \in \Pi_N(\hat{\sigma}_0)$. We may use this ordering to group the ordered strand-pair vectors into classes, as follows: First, we pick an ordering for the set of permutations $\Pi_N(\hat{\sigma}_0)$, and list them in some order $\pi_{N,1}, \pi_{N,2}, \ldots$. Also, given $\vec{s} \in \{0, 1\}^N$, we define $k$ to be the number that $\vec{s}$ represents in binary notation, so that $k = s_1 2^{N-1} + s_2 2^{N-2} + \cdots + s_N$.

Given an ordered strand-pair vector $\vec{\vec{\sigma}} = ((\sigma_1, \sigma'_1), \ldots, (\sigma_N, \sigma'_N))$, we say that $\vec{\vec{\sigma}}$ belongs to class $(n, k)$ if, for each $i = 1, \ldots, N$, we have that $D_H(\sigma_i, \sigma_{0,\pi_{N,n}(i)})$, $D_H(\sigma'_i, \bar{\sigma}_{0,\pi_{N,n}(i)})$ are finite if $s_i = 0$, or $D_H(\sigma_i, \bar{\sigma}_{0,\pi_{N,n}(i)})$, $D_H(\sigma'_i, \sigma_{0,\pi_{N,n}(i)})$ are finite if $s_i = 1$, where $(s_1, \ldots, s_N)$ is the binary representation of $k$ as stated above.
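The class label is just a pair of integers, and the binary encoding of the orientation vector can be sketched as follows. The function names are hypothetical, and the list `n_per_group` stands for the copy numbers $n_1, \ldots, n_M$ of the homologous groups.

```python
from math import factorial


def class_index(s):
    """k = s_1 * 2^(N-1) + s_2 * 2^(N-2) + ... + s_N for a tuple of bits s."""
    k = 0
    for bit in s:
        k = 2 * k + bit
    return k


def orientation_bits(k, n_chromosomes):
    """Inverse of class_index: recover (s_1, ..., s_N) from k."""
    return tuple((k >> (n_chromosomes - 1 - i)) & 1 for i in range(n_chromosomes))


def count_master_vectors(n_per_group):
    """2^N * N! / (n_1! * ... * n_M!) ordered strand-pair vectors of the master."""
    n_chromosomes = sum(n_per_group)
    orderings = factorial(n_chromosomes)
    for n_k in n_per_group:
        orderings //= factorial(n_k)
    return 2 ** n_chromosomes * orderings


if __name__ == "__main__":
    print(class_index((1, 0, 1)))          # orientation vector -> k
    print(orientation_bits(5, 3))          # k -> orientation vector
    print(count_master_vectors([2, 1]))    # N = 3 with copy numbers 2 and 1
```

Each class $(n, k)$ thus corresponds to one chromosome ordering and one choice of strand orientation per chromosome.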

We make the following claim: If we start with a clonal 
population consisting entirely of the wild-type (i.e. the 
master genome), then all genomes produced by the evo- 
lutionary dynamics of the population give rise to ordered 
strand-pair vectors belonging to a unique class. To prove 
this, we must show that all genomes produced by the 
evolutionary dynamics give rise to ordered strand-pair 
vectors belonging to some class, and then we must show 
that a given ordered strand-pair vector cannot belong to 
more than one class. 
We have already shown that all genomes produced by the evolutionary dynamics of the population give rise to genomes which have chromosomes belonging to the $k$th homologous group for each $k = 1, \ldots, M$. Let us then consider some ordered strand-pair vector $\vec{\vec{\sigma}} = ((\sigma_1, \sigma'_1), \ldots, (\sigma_N, \sigma'_N))$ generated by some genome in the population. The ordered strand-pair $(\sigma_i, \sigma'_i)$ is generated from the strand pair $\{\sigma_i, \sigma'_i\}$, which in turn belongs to some homologous group as defined above. We say that $(\sigma_i, \sigma'_i)$ belongs to the same homologous group as $\{\sigma_i, \sigma'_i\}$.

So, for each homologous group $k$, let $i_{k,1}, \ldots, i_{k,n_k}$ denote the indices of the ordered strand-pairs belonging to the $k$th homologous group. Consider then some $i \in \{i_{k,1}, \ldots, i_{k,n_k}\}$, and let us consider the pair of Hamming distances $D_H(\sigma_i, \sigma_{0,\pi_N(i)})$, $D_H(\sigma'_i, \bar{\sigma}_{0,\pi_N(i)})$, and $D_H(\sigma_i, \bar{\sigma}_{0,\pi_N(i)})$, $D_H(\sigma'_i, \sigma_{0,\pi_N(i)})$. If the first pair of Hamming distances are finite, then the second pair is infinite, and vice versa. However, unless $\{\sigma_{0,\pi_N(i)}, \bar{\sigma}_{0,\pi_N(i)}\}$ is equal to $\{\sigma_{0,i_k}, \bar{\sigma}_{0,i_k}\}$, the master ordered strand-pair of the $k$th homologous group, then both pairs of Hamming distances are infinite. Therefore, in order for each ordered strand-pair $(\sigma_i, \sigma'_i)$, where $i \in \{i_{k,1}, \ldots, i_{k,n_k}\}$, to have the property that either $D_H(\sigma_i, \sigma_{0,\pi_N(i)})$, $D_H(\sigma'_i, \bar{\sigma}_{0,\pi_N(i)})$ or $D_H(\sigma_i, \bar{\sigma}_{0,\pi_N(i)})$, $D_H(\sigma'_i, \sigma_{0,\pi_N(i)})$ are finite, it must follow that $\pi_N$ must be a permutation that sends the $n_k$ master strand-pairs associated with the $k$th homologous group to the indices $\{i_{k,1}, \ldots, i_{k,n_k}\}$. In order for this to hold for all the homologous groups, it follows that $\pi_N$ must be the unique permutation that sends, for each homologous group $k$, the $n_k$ master strand-pairs to the indices $\{i_{k,1}, \ldots, i_{k,n_k}\}$. We let $\pi_{N,n}$ denote this particular permutation, where $n$ represents the position of this permutation in the ordering of the permutations of $\Pi_N(\hat{\sigma}_0)$.

Now, for a given $i \in \{i_{k,1}, \ldots, i_{k,n_k}\}$, we have shown that the pair of Hamming distances $D_H(\sigma_i, \sigma_{0,i_k})$, $D_H(\sigma'_i, \bar{\sigma}_{0,i_k})$, and $D_H(\sigma_i, \bar{\sigma}_{0,i_k})$, $D_H(\sigma'_i, \sigma_{0,i_k})$ cannot be simultaneously finite. If the first pair of Hamming distances is finite, then we have $s_i = 0$, while if the second pair is finite then we have $s_i = 1$. If we let $k$ denote the number that $(s_1, \ldots, s_N)$ represents in binary notation, then we have that $\vec{\vec{\sigma}}$ belongs to the class $(n, k)$. Note by construction that $(n, k)$ must be unique.

Let us now consider the random chromosome segrega- 
tion equations, and let us consider some vector of ordered 
strand-pairs a belonging to class (n, k). If we look at the 
sum in the equations, we notice that we have a prod- 
uct of terms, each of which is either a"), (tij, a[)) 



or p((a"; <?"), ct,)). For the first probability to be 
non-zero, we must have that £)jy(cf,(7j) be finite. This 
implies that erf must be a finite Hamming distance away 
from the same master strand to which Oi is a finite Ham- 
ming distance away, and so the ordered strand-pair with 
which a'l is associated must belong to the same homolo- 
gous group as (o"j,(T^). If we let (n',k') denote the class 
to which a" belongs, and if we let (n, k) denote the class 
to which a belongs, then we must have that n' = n and 
k! = k, and so a" belongs to the same class as a. 

Now, for the probability p((cr", a"), (ct^, CTj)) to be non- 
zero, we must have that Dufa", crQ is finite. Since a[ is 
a finite Hamming distance away from the complement of 
the master strand to which <ii is a finite Hamming dis- 
tance away, we have that a\ is a finite Hamming distance 
away from the master strand to which ai is a finite Ham- 
ming distance away, and so a" is also a finite Hamming 
distance away from the master strand to which oi is a 
finite Hamming distance away. Following a similar argu- 
ment as before, this implies that a" belongs to the same 
class of a. 

As a result, for random chromosome segregation, we 
need only consider contributions from ordered strand- 
pair vectors that are in the same class as the daughter 
ordered strand-pair vector. 

Now let us consider immortal strand co-segregation. 
For the probability a"), (a i: a^)) to be non-zero, 

we have that Dft(cr",<Ji) must be finite, and so, fol- 
lowing a similar argument as before, we obtain that a" 
must belong to the same class as a. For the probability 
p({a'"\&"), (ai, <j'i)) to be non-zero, we must have that 
Dnip'l' ,Gi) is finite, and so a'/' must be a finite Ham- 
ming distance away from the complement of the master 
strand to which Ui is a finite Hamming distance away. 
Therefore, er" must be a finite Hamming distance away 
from the master strand to which &i is a finite Hamming 
distance away, and so we obtain that a" must belong to 
the same class as a. 

As a result, for immortal strand co-segregation, we 
need only consider contributions from ordered strand- 
pair vectors that are in the same class as the daughter 
ordered strand-pair vector. 

At this point, the random and immortal strand segregation equations for arbitrary genomes become formally identical to the equations for haploid genomes. Since these equations have already been derived in [18], we obtain that the infinite sequence length equations are, for random chromosome segregation,
dz



((Ic,i,0.1i,0),-,(Ic,n,0,In,0)) 

dt 



"( K ((ic,i.OJi,0),...,(; c ,«,0,/jv,0)) + K (t)) z ((lc,i,O,li,O),...,(lc,N,O,lN,0)) 



i N /c ' 1 i \ - 

1 1 -ju(l-A/2) TTr~ ..^1 wih. 1 , Aa i^\ 



2"-U 1 !---/ Ar ! 

ic,i-'i,i lc,!f— lx,n oo 



7' r 9 



r.V^ 2 



X E "' E ' " E " ' X, K (( i ci-ii, 1 -ii, 1 ,'i, 1 ,'! i , 1 ,0),...,(ic,iv-i' liJV -ii iN ^, JV ,^ iJV ,0)) 

^2,1 ~0 ^2,iV~0 ^3,1~^ ^3,iV~^ 

X ^((^C t l-^ a -^2a^2a' / 3,l'°)---( / C 7 JV-^ iJV -/2,iV' / 2,iV^3,JV' )) 



(18) 



where $n$ is the number of strand-pairs with $l_{1,i} > 0$. Note
that we do not use the $\alpha_i$ symbol, but rather $\tilde{\alpha}_i$. The
reason for this is that $\alpha_i$ refers to $L_i/L$, where $L_i$ is the
length of the master chromosome of the $i$th homologous
group. Here, $\tilde{\alpha}_i$ refers to $L_i/L$, where in this case $L_i$
is the length of the $i$th chromosome in the chromosome
ordering associated with the given class of vectors of ordered
strand-pairs. If the $i$th chromosome belongs to the
$k$th homologous group, then $\tilde{\alpha}_i = \alpha_k$.

For immortal strand co-segregation, we have that, 



Z((l C ,l,0,h,0),...,(lG,N,0,lN,0)) 



dt 



= - M (l-A/2) 



-(K((lc,x,0,h,0),-,(lc,N,0,lrf,0)) + K (t)) z ((l c ,i,0,luO),...,(lc,N ,0,l N ,0)) 



JV 



\ 1 Aai/U^ 



n^a-A)]'* e tti( 



IC.N . „ 

l' =o 1 - Ar ' 

'l,N u 



OO OO 

X] "' X K ((ic,i-'i,i' ' / 2,i^).-.('c,«-ii, JV :0,i^ lv ,o))2;((i c , 1 -;' lil ,oj^ 1 ,o),...,(i c , JV -r 1 ^o^.^o)) 



1 



N 



h\...l N \ 



-M(l-A/2) 



1,JV 



E A/ "((icM-ii.l-ii.nO.i^, 



i' =0 !' „=0 



i, 1 ,0),...,(tc,w-<i >w -Ji, N ,0,<i, w .O)) z ((<c.i-'i,i- J 2,i. ' , 2,i.O) Vc,N-ll N -l' 2 , N ,0,l' 2 , N ,0)) 

(19) 



Here, we define $z_{((l_{C,1}, l_{L,1}, l_{R,1}, l_{B,1}), \dots, (l_{C,N}, l_{L,N}, l_{R,N}, l_{B,N}))}$
to be the fraction of vectors of ordered strand-pairs
in the population, belonging to a specific class,
characterized by the parameters
$((l_{C,1}, l_{L,1}, l_{R,1}, l_{B,1}), \dots, (l_{C,N}, l_{L,N}, l_{R,N}, l_{B,N}))$, where
$(l_{C,i}, l_{L,i}, l_{R,i}, l_{B,i})$ refer to the values of $l_C, l_L, l_R, l_B$ for
the $i$th strand-pair, respectively.

The parameter $\lambda$ is a lesion repair probability, and
is the probability that a given mismatch that survived
all error repair mechanisms associated with the replication
process (e.g. proofreading and mismatch repair) will
eventually be eliminated by the lesion repair machinery
of the cell. Here, because there is no longer any discrimination
between parent and daughter strands, if a given
mismatch is eliminated, then there is a 50% probability
that the original base-pair will be restored, and a 50%
probability that a mutation will be fixed in the genome.
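
The three possible fates of a post-replication mismatch can be tabulated directly. The following is a minimal Python sketch, not part of the original derivation: the function names are illustrative, and `expected_fixed_mutations` assumes that a newly synthesized strand of a chromosome with length fraction `alpha` carries `alpha * mu` mismatches on average.

```python
def mismatch_outcomes(lam):
    """Per-mismatch outcome probabilities for lesion repair probability lam."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return {
        "restored": lam / 2.0,    # repaired back to the original base-pair
        "fixed": lam / 2.0,       # repaired the other way: mutation fixed
        "unrepaired": 1.0 - lam,  # mismatch persists in the chromosome
    }


def expected_fixed_mutations(lam, alpha, mu):
    """Mean number of mutations fixed per daughter chromosome.

    Assumption (not stated in this section): the mean number of
    post-replication mismatches on the chromosome is alpha * mu.
    """
    return mismatch_outcomes(lam)["fixed"] * alpha * mu


if __name__ == "__main__":
    print(mismatch_outcomes(1.0))
    print(expected_fixed_mutations(1.0, 0.25, 2.0))
```

For perfect lesion repair (`lam = 1.0`) no mismatches persist, and half of them are converted into fixed mutations.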

For random chromosome segregation, we are able to
show in [18] that vectors of ordered strand-pairs
with $l_{B,i} > 0$ cannot be produced through replication,
hence we may assume that $l_{B,i} = 0$. Furthermore, we
can show that $l_{L,i}$, $l_{R,i}$ cannot be simultaneously greater
than 0. We only show the equations allowing for $l_{R,i} > 0$,
since the equations where we allow $l_{L,i} > 0$ are identical.
For those values of $i$ for which $l_{L,i} > 0$ and $l_{R,i} = 0$, we
have that $l_{1,i}$ represents the value of $l_{L,i}$. The equations
that follow are then identical to what is written above.

For immortal strand co-segregation, we are able to
show in [18] that vectors of ordered strand-pairs with
$l_{B,i} > 0$ cannot be produced through replication, hence
we may assume that $l_{B,i} = 0$. Furthermore, we can show
that $l_{L,i}$ must be 0 as well.



4. Perfect lesion repair

In contrast to the haploid case, solving for the steady-state
mean fitness of the general case turns out to be
considerably more difficult. We have therefore decided
to solve for the steady-state mean fitness for the specific
case where $\lambda = 1$. This assumes perfect lesion repair, so
that we are dealing with genomes where each chromosome
consists of perfectly complementary DNA strands.
In this case, it may be shown that both the random and
immortal strand co-segregation equations reduce to,

$$
\frac{dz_{(l_1,\dots,l_N)}}{dt} = -(\kappa_{(l_1,\dots,l_N)} + \bar{\kappa}(t)) z_{(l_1,\dots,l_N)}
+ 2 e^{-\mu/2} \sum_{l_1'=0}^{l_1} \cdots \sum_{l_N'=0}^{l_N} \prod_{i=1}^{N} \frac{1}{l_i'!} \left(\frac{\tilde{\alpha}_i \mu}{2}\right)^{l_i'}
\times \kappa_{(l_1-l_1',\dots,l_N-l_N')} z_{(l_1-l_1',\dots,l_N-l_N')}
\tag{20}
$$

where we have changed notation so that $z_{(l_1,\dots,l_N)}$
in the notation of the previous equation refers to
$z_{((l_1,0,0,0),\dots,(l_N,0,0,0))}$ of the random and immortal strand
segregation equations given earlier.



III. RESULTS AND DISCUSSION 

In this section, we will obtain the steady-state mean
fitness for the fitness landscape defined in the previous
subsection. For populations within a given class, we
begin by defining $z_{\{i_1,\dots,i_k\}}$ to be the total fraction of
vectors of ordered strand-pairs where the chromosomes
with indices $i_1, \dots, i_k$ are non-viable, while the remaining
chromosomes are viable. To define this population
fraction more formally, we introduce the following notation:
We let $e_1, e_2, \dots, e_N$ denote the standard orthonormal
basis of $\mathbb{R}^N$, so that $e_1 = (1, 0, 0, \dots, 0)$, $e_2 =
(0, 1, 0, \dots, 0)$, ..., $e_N = (0, 0, \dots, 0, 1)$. We then have
that,

$$
z_{\{i_1,\dots,i_k\}} = \sum_{l_{i_1}=1}^{\infty} \cdots \sum_{l_{i_k}=1}^{\infty} z_{l_{i_1} e_{i_1} + \cdots + l_{i_k} e_{i_k}}
\tag{21}
$$

From Appendix B, we may then show that,

$$
\frac{dz_{I=\{i_1,\dots,i_k\}}}{dt} = -(\kappa_I + \bar{\kappa}(t)) z_I
+ 2 e^{-\left(\sum_{i \in \{1,\dots,N\}/I} \tilde{\alpha}_i\right)\mu/2} \times
\sum_{G \subseteq I} \prod_{i \in G} \left(1 - e^{-\tilde{\alpha}_i \mu/2}\right) \kappa_{I/G} z_{I/G}
\tag{22}
$$

where $\kappa_I$ is defined as the fitness of vectors of ordered
strand-pairs where the non-viable chromosomes are of
indices $i_1, \dots, i_k$.

Now, the fitness does not depend on the specific indices
that are knocked out, but rather the homologous groups
to which each set of indices belong. If we let $\Gamma_i$ denote the
indices corresponding to the homologous group $i$, then
given a set of knocked out indices $I$, we may define $G_i =
I \cap \Gamma_i$ to be the subset of knocked out indices belonging
to homologous group $i$. We then have that,

$$
\frac{dz_{G_1 \cup \cdots \cup G_M}}{dt} = -(\kappa_{G_1 \cup \cdots \cup G_M} + \bar{\kappa}(t)) z_{G_1 \cup \cdots \cup G_M}
+ 2 e^{-(1 - m_1 \alpha_1 - \cdots - m_M \alpha_M)\mu/2} \times
\sum_{G_1' \subseteq G_1} \cdots \sum_{G_M' \subseteq G_M} \prod_{n=1}^{M} \left(1 - e^{-\alpha_n \mu/2}\right)^{o(G_n')}
\kappa_{G_1/G_1' \cup \cdots \cup G_M/G_M'} z_{G_1/G_1' \cup \cdots \cup G_M/G_M'}
\tag{23}
$$

where $m_i$ is the number of indices in $G_i$, so that $m_i =
o(G_i)$.

Now, define $z(m_1, \dots, m_M)$ to be the total
population fraction of genomes with $m_i$
knocked-out chromosomes from the $i$th homologous
group. That is,

$$
z(m_1, \dots, m_M) = \sum_{G_1 \subseteq \Gamma_1,\, o(G_1) = m_1} \cdots \sum_{G_M \subseteq \Gamma_M,\, o(G_M) = m_M} z_{G_1 \cup \cdots \cup G_M}.
$$

We then have, from Appendix B,

$$
\frac{dz(m_1,\dots,m_M)}{dt} = -(\kappa(m_1,\dots,m_M) + \bar{\kappa}(t)) z(m_1,\dots,m_M)
+ 2 e^{-(1 - m_1 \alpha_1 - \cdots - m_M \alpha_M)\mu/2} \times
\sum_{m_1'=0}^{m_1} \cdots \sum_{m_M'=0}^{m_M} \left(1 - e^{-\alpha_1 \mu/2}\right)^{m_1'} \times \cdots \times \left(1 - e^{-\alpha_M \mu/2}\right)^{m_M'} \times
\binom{n_1 - m_1 + m_1'}{m_1'} \times \cdots \times \binom{n_M - m_M + m_M'}{m_M'} \times
\kappa(m_1 - m_1', \dots, m_M - m_M') z(m_1 - m_1', \dots, m_M - m_M')
\tag{24}
$$



Now, at steady-state, let $m^*$ denote the smallest
value of $m_1 + \cdots + m_M$ for which there exists a
$z(m_1, \dots, m_M) > 0$. Then given $m_1, \dots, m_M$ for which
$z(m_1, \dots, m_M) > 0$ and $m^* = m_1 + \cdots + m_M$, we have,
at steady-state,

$$
0 = \left[\left(2 e^{-(1 - m_1 \alpha_1 - \cdots - m_M \alpha_M)\mu/2} - 1\right) \kappa(m_1,\dots,m_M) - \bar{\kappa}\right] z(m_1,\dots,m_M)
\tag{25}
$$

which implies that $\bar{\kappa} =
\kappa(m_1,\dots,m_M)\left(2 e^{-(1 - m_1 \alpha_1 - \cdots - m_M \alpha_M)\mu/2} - 1\right)$. The
reason for this is that, in the sum in Eq. (24), if
$z(m_1 - m_1', \dots, m_M - m_M') > 0$, then by definition of
$m^*$ we have that $(m_1 - m_1') + \cdots + (m_M - m_M') \geq m^*
\Rightarrow m^* - (m_1' + \cdots + m_M') \geq m^* \Rightarrow m_1' + \cdots + m_M' \leq 0
\Rightarrow m_1' = \cdots = m_M' = 0$.

Now, we also have, from Eq. (24), for arbitrary values
of $m_1, \dots, m_M$, that

$$
\frac{dz(m_1,\dots,m_M)}{dt} \geq \left[\kappa(m_1,\dots,m_M) \left(2 e^{-(1 - m_1 \alpha_1 - \cdots - m_M \alpha_M)\mu/2} - 1\right) - \bar{\kappa}(t)\right] z(m_1,\dots,m_M)
\tag{26}
$$

and so, for the steady-state to be stable, we must have
that $\bar{\kappa} \geq \kappa(m_1,\dots,m_M)\left(2 e^{-(1 - m_1 \alpha_1 - \cdots - m_M \alpha_M)\mu/2} - 1\right)$.
Combined with the fact that equality holds for some set
of values of $m_1, \dots, m_M$, we then have that,

$$
\bar{\kappa} = \max\left\{\kappa(m_1,\dots,m_M)\left(2 e^{-(1 - m_1 \alpha_1 - \cdots - m_M \alpha_M)\mu/2} - 1\right)\right\}
= \max\left\{k\left(2 e^{-(\alpha_1 + \cdots + \alpha_M)\mu/2} - 1\right), 1\right\}
\tag{27}
$$

FIG. 1: A plot of $\bar{\kappa}$ versus $\mu$ comparing both the analytical
expression for $\bar{\kappa}$ (solid line) with the values obtained from
stochastic simulations (dots). We have $M = 2$, $n_1 = n_2 = 1$,
$L_1 = L_2 = 10$. We took a population size of 1,000.

FIG. 2: A plot of $\bar{\kappa}$ versus $\mu$ comparing both the analytical
expression for $\bar{\kappa}$ (solid line) with the values obtained from
stochastic simulations (dots). We have $M = 2$, $n_1 = n_2 = 2$,
$L_1 = L_2 = 10$. We took a population size of 1,000.

We compared the results of our analysis with results 
obtained from stochastic simulations of replicating pop- 
ulations. These are shown in Figures 1-3. Note the ex- 
cellent agreement between the analytical expression for 
the mean fitness and the numerical results obtained from 
the stochastic simulations. 
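
As an independent, deterministic cross-check (this is not the stochastic simulation used for the figures), one can integrate the dynamical equations for $z(m_1, \dots, m_M)$ of Eq. (24) numerically and compare the long-time mean fitness with the closed-form expression of Eq. (27). The Python sketch below rests on several assumptions that are not taken from the text: the fitness landscape is taken to be $k$ when every homologous group retains at least one viable chromosome and 1 otherwise, the value $k = 10$ is purely illustrative, forward Euler is an arbitrary choice of integrator, and all function names are hypothetical.

```python
import itertools
import math


def analytic_mean_fitness(k, mu, alphas):
    """Closed form of Eq. (27): max{k(2 exp(-(sum alpha) mu / 2) - 1), 1}."""
    return max(k * (2.0 * math.exp(-sum(alphas) * mu / 2.0) - 1.0), 1.0)


def integrate_mean_fitness(k, mu, n, lengths, dt=0.01, steps=10000):
    """Forward-Euler integration of Eq. (24); returns (mean fitness, alphas)."""
    total_length = sum(ni * li for ni, li in zip(n, lengths))
    alphas = [li / total_length for li in lengths]  # alpha_i = L_i / L
    states = list(itertools.product(*[range(ni + 1) for ni in n]))

    def kappa(m):
        # Assumed landscape: fitness k only if each homologous group keeps
        # at least one viable chromosome; otherwise fitness 1.
        return k if all(mi < ni for mi, ni in zip(m, n)) else 1.0

    # gain[m] holds (parent state, weight * parent fitness) for the sum in Eq. (24)
    gain = {}
    for m in states:
        viable_fraction = 1.0 - sum(mi * a for mi, a in zip(m, alphas))
        prefactor = 2.0 * math.exp(-viable_fraction * mu / 2.0)
        terms = []
        for mp in itertools.product(*[range(mi + 1) for mi in m]):
            weight = prefactor
            for mi, mpi, ni, a in zip(m, mp, n, alphas):
                weight *= (1.0 - math.exp(-a * mu / 2.0)) ** mpi
                weight *= math.comb(ni - mi + mpi, mpi)
            parent = tuple(mi - mpi for mi, mpi in zip(m, mp))
            terms.append((parent, weight * kappa(parent)))
        gain[m] = terms

    z = {m: 0.0 for m in states}
    z[states[0]] = 1.0  # start with no knocked-out chromosomes
    for _ in range(steps):
        mean = sum(kappa(m) * z[m] for m in states)
        dz = {m: -(kappa(m) + mean) * z[m] + sum(w * z[p] for p, w in gain[m])
              for m in states}
        for m in states:
            z[m] += dt * dz[m]
    return sum(kappa(m) * z[m] for m in states), alphas


if __name__ == "__main__":
    for mu in (1.0, 6.0):
        numeric, alphas = integrate_mean_fitness(10.0, mu, (2, 2), (10.0, 10.0))
        print(mu, numeric, analytic_mean_fitness(10.0, mu, alphas))
```

With $M = 2$, $n_1 = n_2 = 2$ and $L_1 = L_2 = 10$, the two values printed for each $\mu$ should agree closely, both below and above the error catastrophe.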



IV. CONCLUSIONS 

This paper developed the semiconservative quasispecies
equations for polysomic genomes. In contrast to
previous work [18], the quasispecies equations developed
here are not restricted to haploid genomes, but rather
may be applied to diploid and even polyploid genomes.

By an appropriate transformation of variables, these
generalized equations may be recast into a form that
makes them formally identical to the equations developed
for haploid genomes. However, because of the existence
of identical copies of chromosomes in polyploid genomes,
we were unable to obtain an analytical expression for the
mean fitness for the case of arbitrary lesion repair, as
we were able to do for the haploid equations. This of
course does not mean that an analytical expression does
not exist. It simply means that obtaining an analytical
expression for the mean fitness is considerably more
difficult for the polyploid case than it is for the haploid
case. We therefore solved for the mean fitness for the
case of perfect lesion repair ($\lambda = 1$), which was considerably
more tractable than the general case, leaving the
case of arbitrary lesion repair for future work.

FIG. 3: A plot of $\bar{\kappa}$ versus $\mu$ comparing both the analytical
expression for $\bar{\kappa}$ (solid line) with the values obtained from
stochastic simulations (dots). We have $M = 2$, $n_1 = n_2 = 5$,
$L_1 = L_2 = 10$. We took a population size of 1,000.

The mean fitness results obtained from stochastic simulations
were found to be in excellent agreement with the
analytical results that we derived for the fitness landscape
that was considered in this paper. We find that beyond
a critical mutation rate, the population becomes entirely
non-viable, with a low fitness of 1. This signals the onset
of the error catastrophe, whereby natural selection can
no longer localize the population about the fast replicating
genotypes, and the result is dynamics governed by
pure genetic drift.
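
The critical mutation rate can be made explicit. Assuming the mean fitness has the form $\max\{k(2e^{-(\alpha_1 + \cdots + \alpha_M)\mu/2} - 1), 1\}$ given in Eq. (27), setting the first argument equal to 1 gives $\mu_{crit} = 2\ln[2k/(k+1)]/(\alpha_1 + \cdots + \alpha_M)$. The short Python sketch below evaluates this; the function names and the value $k = 10$ are illustrative assumptions and do not come from the paper.

```python
import math


def mean_fitness(k, mu, alpha_sum):
    """Eq. (27) with alpha_sum = alpha_1 + ... + alpha_M."""
    return max(k * (2.0 * math.exp(-alpha_sum * mu / 2.0) - 1.0), 1.0)


def critical_mutation_rate(k, alpha_sum):
    """Mutation rate at which k(2 exp(-alpha_sum * mu / 2) - 1) drops to 1."""
    if k <= 1.0:
        raise ValueError("a viable fitness k > 1 is required")
    return 2.0 * math.log(2.0 * k / (k + 1.0)) / alpha_sum


if __name__ == "__main__":
    # M = 2 groups of equal length with n copies each: alpha_sum = 1 / n
    for n in (1, 2, 5):
        print(n, critical_mutation_rate(10.0, 1.0 / n))
```

Under these assumptions the threshold scales linearly with the number of redundant copies per homologous group, since increasing the copy number lowers $\alpha_1 + \cdots + \alpha_M$.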

In the end, we regard this work as a "methodology" 
paper, in the sense that its purpose is to extend the
quasispecies formalism developed for haploid genomes 
to deal with more complicated genomes. The analyt- 
ical solution obtained for our chosen fitness landscape 
and lesion repair probability, along with the stochastic 
simulation results, are meant to confirm the validity of 
the master equations (Eqs. (7) and (13)). Therefore, 
future research will involve using these equations, along 
with similar equations developed in [TOl HE] , to model the 
evolutionary dynamics in asexually replicating unicellu- 
lar populations. In particular, these equations could be 
highly useful for modeling mutation-propagation in stem 
cells and tumors, and could therefore be relevant for can- 
cer modeling and aging. In this vein, an additional exten- 
sion of our model that will need to be considered is the 
incorporation of genomic instability into our framework. 
APPENDIX A: DERIVATION OF THE FINITE
SEQUENCE LENGTH EQUATIONS

In this Appendix, we derive the finite sequence length 
equations for both random segregation and immortal 
strand co-segregation. 

1. Random segregation 

We begin with random segregation, and our goal is
to initially derive equations for the $x_\sigma$ population fractions.
Although the chromosomes are formally indistinguishable,
for purposes of the derivation we can assign
an arbitrary chromosome ordering to every genome. The
only requirement is that once an ordering is chosen, we
use that same ordering for the given genome. Similarly,
for each chromosome in a given genome with an assigned
ordering, we can tag one of the strands with a "0", and
the other with a "1". Again, this tagging scheme is arbitrary;
however, once chosen, it must be consistent. This
chromosome ordering and strand tagging scheme allows
us to appropriately keep track of the chromosomes during
the replication process.

During the replication process, every strand of every
chromosome serves as the template for the synthesis of
a daughter strand, and therefore of a new chromosome.
For convenience of the derivation, we assume that each
new chromosome segregates into a left daughter cell and
a right daughter cell (relevant figures may be found in
[18]). For a given parent chromosome from the original
cell, the chromosomes formed from the "0" strands and
the "1" strands segregate into opposite cells. Since chromosome
segregation is random, each chromosome has a
50% probability of segregating into a given daughter cell.
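
The bookkeeping described above can be illustrated by enumerating every segregation outcome for a genome of $N$ tagged chromosomes. The following Python sketch is illustrative only: the function name and the tuple representation are assumptions, with `left[i]` denoting the tag ("0" or "1") of the parent strand of chromosome `i` whose daughter chromosome ends up in the left cell.

```python
import itertools


def segregation_patterns(num_chromosomes):
    """Yield (left, right, probability) for each random segregation outcome.

    left[i] is the tag (0 or 1) of the template strand of chromosome i whose
    daughter chromosome goes to the left cell; the right cell receives the
    chromosome built on the other strand, so right[i] = 1 - left[i].
    """
    probability = 0.5 ** num_chromosomes
    for left in itertools.product((0, 1), repeat=num_chromosomes):
        right = tuple(1 - tag for tag in left)
        yield left, right, probability


if __name__ == "__main__":
    for left, right, probability in segregation_patterns(2):
        print(left, right, probability)
```

Each of the $2^N$ patterns is equally likely, which is the origin of the sums over $s_1, \dots, s_N = 0, 1$ in the derivation that follows.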

Note that for a parent genome
$\{\{\sigma_1''^{(0)}, \sigma_1''^{(1)}\}, \dots, \{\sigma_N''^{(0)}, \sigma_N''^{(1)}\}\}$, the
parent strands $\sigma_1''^{(s_1)}, \dots, \sigma_N''^{(s_N)}$, with each $s_i = 0, 1$,
can only produce the genome $\{\{\sigma_1, \sigma_1'\}, \dots, \{\sigma_N, \sigma_N'\}\}$
if the parent strands $\sigma_1''^{(s_1)}, \dots, \sigma_N''^{(s_N)}$ respectively
produce $\{\sigma_{\pi_N(1)}, \sigma'_{\pi_N(1)}\}, \dots, \{\sigma_{\pi_N(N)}, \sigma'_{\pi_N(N)}\}$, where
$\pi_N$ denotes a permutation of the strand indices. Note
that when considering the set of permutations of the
strand indices, we have to consider those $\pi_N \in \pi_N(\sigma)$,
where $\pi_N(\sigma)$ denotes the set of all permutations of the
strand indices so that all the ordered strand-pair vectors
$(\{\sigma_{\pi_N(1)}, \sigma'_{\pi_N(1)}\}, \dots, \{\sigma_{\pi_N(N)}, \sigma'_{\pi_N(N)}\})$ are distinct. If
we consider all permutations, then since some chromosomes
are identical, we will be over-counting the total
contribution. We then have,
= -(«(*) + Ko-)x,j + ^2 K&ttX&i X



*"={{a" 

N N 



«i=0 sjy=0 7rjvG7Tjv(o-) *=1 *=1 



= -(«(*) + Kct)^* + Ka"a:a-" 



-,/r r "( ) "Wi /_"(°) „"W\\ 



x^i E E'"En^' ,s ^'V%(.)-< ( .)}) 

ttn £w]v (a-) «i=0 s N =Oi=l 

= -(«(*) + + E K*"^" x 

^ ={{ ^(0) ><( i) }i ... ;{(7 »(0) ;<(1 , }} 

E Ilbt^^'OJ^w.vwll+pt^V^K^^w})] ( A1 ) 

Tjv6irAr(o') i=l 



which is equivalent to Eq. (3).

We next derive the equations for the ordered chromosome
representation. We obtain,
-(n(t) + K a )Xa + N _ x E Kct" -7-j



ni! x • • • x n m \ 

l%a" T7\ X 



dt vw a> a 2 W -! ^ a n'A x ■■• x n' A a N\ 

a" m 
N 



5-" Tr' N £Tr N (<j") 



N 



= -{K{t) + K 9 )xt+ 1 E E K (K\.,,,,<';_nj,....{<v, M1 ,<r fW1 }) x (KVnv<'; ,,: !■-". . 



<t" 7T^eirjv((T") 



A' 



iY 



e n^^w^")' Wf(o>C(«)}) +p(( <T ^(i)^")i {^w.vt,)})] 

1 W 

= -(«(*) + n 9 )x 5 + 2N _ lm E E K S (i)i °i N (%)}) + {^(i), 

= -(«(*) + + jv_i AT , E E ,"'"-1 },-,W-i >°"'-i }) X iW-i },..;W-i >"'"-! » 

AT 

s"), to, «#) + p(«l 1(i) ; O. K <#)] 

t=i 

1 w 

= -(R(t) + K S )a; 5 + j- E IIWW; & ")' + KW'i fo, <))] 



i=l 



(A2) 



where $\sum_{\pi_N}$ denotes the sum over all permutations of
the indices $1, \dots, N$. From these equations, which are
equivalent to Eq. (5), the passage to the vector of ordered
strand-pairs formulation of the dynamics is identical to
the derivation in [18].

2. Immortal strand co-segregation

For immortal strand segregation, we initially have,
dX a <N/N)

— = -{Ks- + K(t))X a (N/N) 

dXfr(T/N) . -I \\ 1 

= -(Ks- + K(t))X a (T/N) + 2^ K a/ >X a „(N/N) X 

S"(N/N) 

1 1 N N 

e ••• e e [nM(^ ) ;^,K Wl 4( i )))+n^^ <s,+l ^'')'K(^<w))] 

Sl =0 SAr=0,rjve?rjv(<T (T/JV) ) i=1 i=1 

iV JV 

+ ^ K^x a „ ( T/iv) ^ [np((^' (o) ^")> + n^(( <7 i ,(i) ^")' ( a w(i)' 

ct"< t /«) TrjvGTTjvta-fr/")) »=1 »=1 

= -(«5- + K,(t))x a (T/N) + 2jv _ 1 ^ Ks"X a ,,(N/N) X 

a"<«/«) 

E [nb((< (o) ;^0,K N w,<( i )))+p((< (1 ^ / 0,(^w,< w ))] 

TrjveTrjvfCT^/")) i=l 

iV AT 

+ Ksxxs^T/N) [n^^^j^'^wXwwJj+n^^^^^'^w'^w))]^) 



which is equivalent to Eq. (8). 

Converting to the ordered strand-pair formulation of 
the dynamics, we have, 



dX a (N/N) 

-J = -(KS + K(t))X a (N/N) 

7 - N 

dX a (T/N) 



= -(n a + K(t))x a(T/ N) + — ^— j- Yl K a „x a „ ( N/N) \\[p((<y'l; a"), (o u cr-)) + p((<r-"; a"), (<r i; cr-))] 



f ]vT E ^" M X ... XT / i x ^"> E n 1 \x---xn m \x 

N N 

"(0). AfKT/N)^ („ ... „/ . ^,TT^C A" 



»=i i=i 

i w 

= -(/C ? + K(t))^(T/iV) + ^^3- K^2^„(iv/iV) J|b(((T-'; 5 '") J (^,CT-)) +p((0--";ct"), {<7i,Oi))] 



CT"(W/iV) j=l 

+ —7 > > K, r //(o) -, t i/(o) ir(i) T\X, t ma) i/(i) -, t i/(o) //(I) in X 

^ ! <e7lN ^„( T/N)) ( Kv< 1 )'^> } '---' {ff ^w^<"> }) ({ ^( 1 )'^< 1 ) } '-' {CT ^(«)^W(«) }) 



7TJY i — 1 i — 1 



-(k^ + «(i))a; S (T/jv) + ^^3- k^>x s ,,(n/n) ^\\p((o'( ; cr"), (oi, o--)) + p((of'; ct"), (<?i, 

a-"( N /«) i=i 

JV N 

Y K a „x a „( T/N ) IHp^;*"), (o u + V"), (a u a',))] (A4) 

j"(T/N) i—l i=\ 
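which is equivalent to Eq. (9). Following the derivation
in [18], we may then obtain Eq. (13).



APPENDIX B: DERIVATION DETAILS FOR THE
STEADY-STATE MEAN FITNESS

Using the definition for $z_{\{i_1,\dots,i_k\}}$ provided in the main
text, and from the infinite sequence length equations, we
have that,

$$
\frac{dz_{\{i_1,\dots,i_k\}}}{dt} = -(\kappa_{\{i_1,\dots,i_k\}} + \bar{\kappa}(t)) z_{\{i_1,\dots,i_k\}}
+ 2 e^{-\mu/2} \sum_{l_{i_1}=1}^{\infty} \cdots \sum_{l_{i_k}=1}^{\infty} \sum_{l_{i_1}'=0}^{l_{i_1}} \cdots \sum_{l_{i_k}'=0}^{l_{i_k}} \prod_{i \in \{i_1,\dots,i_k\}} \frac{1}{l_i'!} \left(\frac{\tilde{\alpha}_i \mu}{2}\right)^{l_i'}
\times \kappa_{(l_{i_1} - l_{i_1}') e_{i_1} + \cdots + (l_{i_k} - l_{i_k}') e_{i_k}} z_{(l_{i_1} - l_{i_1}') e_{i_1} + \cdots + (l_{i_k} - l_{i_k}') e_{i_k}}
\tag{B1}
$$

We now make use of the following identity:

$$
\sum_{l_1'=0}^{l_1} \cdots \sum_{l_k'=0}^{l_k} f(l_1', \dots, l_k') = \sum_{\{j_1,\dots,j_n\} \subseteq \{1,\dots,k\}} \sum_{l_{j_1}'=1}^{l_{j_1}} \cdots \sum_{l_{j_n}'=1}^{l_{j_n}} f(l_{j_1}' e_{j_1} + \cdots + l_{j_n}' e_{j_n})
\tag{B2}
$$

where the sum includes the empty set and contributes
the term $f(0, \dots, 0)$.

So, we obtain,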



j <1= i ii fc =i {ix,— ,i»}c-[i x * fc } 4^=1 i;„=n,e{3i,..j„} *' 

K E ie o 1 ,..., 3 „}( i <- i i) g i+E i6{i i in}^ 2 ^^^! J „}( i '-^)^+E ie{si ,...,, fc}/{J1 ,..., J „} i ^. 

= -(«{<!,...,»»} + «(*))«{<!,...,<»} + 2e^ /2 x 



E E- E E-E E-E n 

{ji,--M3»}C{^ ) ...,i fc } ) {/ii,... ) /i m }={ii,...,i fe }/{ii,...,3„}ih 1 =l l hm =ll H =l l jn =ll' n =l l' jn =li€{ji,...,j n } 1 

K E ie o 1 ,..., 3 - n} a<-i' i )ei+E ie{ i 1 ,..., iii }/ { j 1 J „ } '^. 2; 5: ie{J1 J „}( i «-^)e ! +E !e{ll ,...,, fc}/{J1 jn y l * & * 

= -( K {h,-,i k } + K(*))*{i I ,.-,i»} + 2e_M/2 X 

E E E n ^T) z: x 



{il,-Jn}e{H,-,»k},{ftl, — ,hm}={h,-,ik}/{jl,-,3n} l' n =l — iJ'n} * 



5^ ^Ese{Ji,..-,Jn} 



(J*-«' < )ei+Ei e{ i 1) ...,i 



he t X 



= -(« { i 1 ,...^ } +K(t))^ {il ,..., ife} +2 e -^ 2 J] J] (e^/ 2 -l)x 

{jl,—iJn}£{*l !—)**} »6{j'li — >in} 

E K {fflv-Mfl P }U{<lv",ifc}/{ilv.-,3n}^{fl 1 ,...,flp}U{<lv".i«,}/{jl,--->J„} ( B3 ) 



Defining I = . . . , i^}, we have that the sum in the final expression may be re-expressed as, 



1G 



HCI G CI/H ieH ieG 

= E E U^-^U^-^/gzi/g 

HCI GCI/H ieH ieG 

GCIieG HCI/GieH 

= e- X ^"o & ^[J[{e & ^ 2 - 1)}k i/g z i/g 

GCI ieG 

= e -(£ ie ,a,W2 £fQ(l - e- 5 ^/ 2 )] K//G z //G (B4) 

GCI ieG 

I 

and so, substituting into Eq. (B3) we obtain Eq. (22) in
the main text.

Now, to derive the dynamical equations for the
$z(m_1, \dots, m_M)$, we start with Eq. (23) and take into
account degeneracies, which gives us,

$$
\frac{dz(m_1,\dots,m_M)}{dt} = -(\kappa(m_1,\dots,m_M) + \bar{\kappa}(t)) z(m_1,\dots,m_M)
+ 2 e^{-(1 - m_1 \alpha_1 - \cdots - m_M \alpha_M)\mu/2} \binom{n_1}{m_1} \times \cdots \times \binom{n_M}{m_M} \times
\sum_{m_1'=0}^{m_1} \cdots \sum_{m_M'=0}^{m_M} \prod_{n=1}^{M} \left(1 - e^{-\alpha_n \mu/2}\right)^{m_n'} \binom{m_1}{m_1'} \times \cdots \times \binom{m_M}{m_M'} \times
\frac{\kappa(m_1 - m_1', \dots, m_M - m_M') z(m_1 - m_1', \dots, m_M - m_M')}{\binom{n_1}{m_1 - m_1'} \times \cdots \times \binom{n_M}{m_M - m_M'}}
\tag{B5}
$$

which gives Eq. (24) after some manipulation.
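
The manipulation connecting Eq. (B5) to Eq. (24) amounts to simplifying the binomial factors group by group. Assuming those factors combine as $\binom{n}{m}\binom{m}{m'}/\binom{n}{m-m'}$, the required identity is $\binom{n}{m}\binom{m}{m'} = \binom{n-m+m'}{m'}\binom{n}{m-m'}$, which follows by expanding the factorials. The Python sketch below (an illustrative check with a hypothetical function name, not part of the original derivation) verifies it exhaustively for small values using exact integer arithmetic.

```python
import math


def binomial_identity_holds(n_max):
    """Check C(n,m) C(m,m') == C(n-m+m', m') C(n, m-m') for all 0 <= m' <= m <= n <= n_max."""
    for n in range(n_max + 1):
        for m in range(n + 1):
            for mp in range(m + 1):
                lhs = math.comb(n, m) * math.comb(m, mp)
                rhs = math.comb(n - m + mp, mp) * math.comb(n, m - mp)
                if lhs != rhs:
                    return False
    return True


if __name__ == "__main__":
    print(binomial_identity_holds(12))
```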